Smart Paper Notes

Mythological Medical Machine Learning: Boosting the Performance of a Deep Learning Medical Data Classifier Using Realistic Physiological Models

Ismail Sadiq, Erick A. Perez-Alday, Amit J. Shah, Ali Bahrami Rad, Reza Sameni, Gari D. Clifford

Category: Machine Learning

2021-12-28

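Objective: To determine whether a realistic, but computationally efficient, model of the electrocardiogram (ECG) can be used to pre-train a deep neural network (DNN) with a wide range of morphologies and abnormalities specific to a given condition, here T-wave alternans (TWA) as a result of post-traumatic stress disorder (PTSD), and significantly boost performance on a small database of rare individuals. Approach: Using a previously validated artificial ECG model, we generated 180,000 artificial ECGs with or without significant TWA, with varying heart rate, breathing rate, TWA amplitude, and ECG morphology. A DNN trained on 70,000 patients to classify 25 different rhythms had its output layer modified to a binary class (TWA or no-TWA, or equivalently, PTSD or no-PTSD), and transfer learning was performed on the artificial ECGs. In a final transfer learning step, the DNN was trained and cross-validated on ECGs from 12 PTSD subjects and 24 controls, for all combinations of using the three databases. Main results: The best-performing approach (AUROC = 0.77, accuracy = 0.72, F1-score = 0.64) was found by performing both transfer learning steps, using the pre-trained arrhythmia DNN, the artificial data, and the real PTSD-related ECG data. Removing the artificial data from training led to the largest drop in performance. Removing the arrhythmia data from training produced a modest, but significant, drop in performance. The final model showed no significant drop in performance on the artificial data, indicating no overfitting. Significance: In healthcare, it is common to have only a small collection of high-quality data and labels, or a larger database with lower-quality (and less relevant) labels. The paradigm presented here, involving model-based performance boosting, provides a solution through transfer learning on a large realistic artificial database and a partially relevant real database.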

translated by Google Translate

The CirCor DigiScope Dataset: From Murmur Detection to Murmur Classification

Jorge Oliveira, Francesco Renna, Paulo Dias Costa, Marcelo Nogueira, Cristina Oliveira, Carlos Ferreira, Alipio Jorge, Sandra Mattos, Thamine Hatem, Thiago Tavares

Category: Machine Learning

2021-08-02

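Cardiac auscultation is one of the most cost-effective techniques used to detect and identify many heart conditions. Computer-assisted decision systems based on auscultation can support physicians in their decisions. Unfortunately, the application of such systems in clinical trials is still minimal, since most of them only aim to detect the presence of extra or abnormal waves in the phonocardiogram signal, i.e., only a binary ground truth variable (normal vs. abnormal) is provided. This is mainly due to the lack of large publicly available datasets in which a more detailed description of such abnormal waves (e.g., cardiac murmurs) exists. To pave the way to more effective research on auscultation-based healthcare recommendation systems, our team has prepared the currently largest pediatric heart sound dataset. A total of 5282 recordings were collected from the four main auscultation locations of 1568 patients; in the process, 215780 heart sounds were manually annotated. Furthermore, and for the first time, each cardiac murmur was manually annotated by an expert annotator according to its timing, shape, pitch, grading, and quality. In addition, the auscultation locations where the murmur is present were identified, as well as the auscultation location where the murmur is detected most intensively. Such a detailed description for a relatively large number of heart sounds may pave the way for new machine learning algorithms with real-world application to the detection and analysis of murmur waves for diagnostic purposes.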

ProductGraphSleepNet: Sleep Staging using Product Spatio-Temporal Graph Learning with Attentive Temporal Aggregation

Aref Einizade, Samaneh Nasiri, Sepideh Hajipour Sardouie, Gari Clifford

Category: Machine Learning

2022-12-09

The classification of sleep stages plays a crucial role in understanding and diagnosing sleep pathophysiology. Sleep stage scoring relies heavily on visual inspection by an expert, which is a time-consuming and subjective procedure. Recently, deep learning neural network approaches have been leveraged to develop generalized automated sleep staging and to account for shifts in distributions that may be caused by inherent inter/intra-subject variability, heterogeneity across datasets, and different recording environments. However, these networks ignore the connections among brain regions and disregard the sequential connections between temporally adjacent sleep epochs. To address these issues, this work proposes an adaptive product graph learning-based graph convolutional network, named ProductGraphSleepNet, for learning joint spatio-temporal graphs, along with a bidirectional gated recurrent unit and a modified graph attention network to capture the attentive dynamics of sleep stage transitions. Evaluation on two public databases, the Montreal Archive of Sleep Studies (MASS) SS3 and SleepEDF, which contain full-night polysomnography recordings of 62 and 20 healthy subjects, respectively, demonstrates performance comparable to the state of the art (Accuracy: 0.867; 0.838, F1-score: 0.818; 0.774, and Kappa: 0.802; 0.775, on each database respectively). More importantly, the proposed network makes it possible for clinicians to comprehend and interpret the learned connectivity graphs for sleep stages.
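The Accuracy and Kappa figures quoted above are standard agreement metrics. As a reading aid, here is a minimal sketch of how accuracy and Cohen's kappa are computed from per-epoch labels. The function names and the toy labels are illustrative assumptions, not the authors' evaluation code.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of epochs whose predicted stage matches the reference stage."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: product of the two marginal label frequencies per class.
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    # Undefined (division by zero) when chance agreement is exactly 1.
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical sleep-stage labels for six 30-second epochs.
reference = ["W", "N1", "N2", "N2", "N3", "REM"]
predicted = ["W", "N2", "N2", "N2", "N3", "REM"]
print(accuracy(reference, predicted), cohen_kappa(reference, predicted))
```

Kappa is reported alongside accuracy because sleep stages are imbalanced, so a classifier can score a high accuracy largely by chance agreement on the dominant stage.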

Invalidator: Automated Patch Correctness Assessment via Semantic and Syntactic Reasoning

Thanh Le-Cong, Duc-Minh Luong, Xuan Bach D. Le, David Lo, Nhat-Hoa Tran, Bui Quang-Huy, Quyet-Thang Huynh

Category: Machine Learning

2023-01-03

In this paper, we propose a novel technique, namely INVALIDATOR, to automatically assess the correctness of APR-generated patches via semantic and syntactic reasoning. INVALIDATOR reasons about program semantics via program invariants, while it also captures program syntax via language semantics learned from a large code corpus using a pre-trained language model. Given a buggy program and the developer-patched program, INVALIDATOR infers likely invariants on both programs. Then, INVALIDATOR determines that an APR-generated patch overfits if it (1) violates correct specifications or (2) maintains error behaviors of the original buggy program. In case our approach fails to determine an overfitting patch based on invariants, INVALIDATOR utilizes a model trained on labeled patches to assess patch correctness based on program syntax. The benefit of INVALIDATOR is three-fold. First, INVALIDATOR is able to leverage both semantic and syntactic reasoning to enhance its discriminative capability. Second, INVALIDATOR does not require new test cases to be generated, but instead relies only on the current test suite and uses invariant inference to generalize the behaviors of a program. Third, INVALIDATOR is fully automated. We have conducted our experiments on a dataset of 885 patches generated on real-world programs in Defects4J. Experimental results show that INVALIDATOR correctly classified 79% of overfitting patches, detecting 23% more overfitting patches than the best baseline. INVALIDATOR also substantially outperforms the best baselines by 14% and 19% in terms of Accuracy and F-Measure, respectively.
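The two-stage decision described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the INVALIDATOR implementation: invariants are modeled as plain strings, "correct specifications" are assumed to be invariants shared by the buggy and developer-patched programs, "error behaviors" are assumed to be invariants that hold only in the buggy program, and `syntactic_score` is a hypothetical stand-in for the trained syntax-based model.

```python
from typing import Callable, Set


def assess_patch(
    buggy_inv: Set[str],
    fixed_inv: Set[str],
    patch_inv: Set[str],
    syntactic_score: Callable[[], float],
    threshold: float = 0.5,
) -> str:
    """Return a verdict string for an APR-generated patch."""
    # Assumption: invariants preserved by the developer fix are correct specifications.
    correct_specs = buggy_inv & fixed_inv
    # Assumption: invariants removed by the developer fix describe error behaviors.
    error_behaviors = buggy_inv - fixed_inv

    # Rule (1): the patch drops an invariant that should still hold.
    if correct_specs - patch_inv:
        return "overfitting: violates correct specification"
    # Rule (2): the patch still satisfies an invariant of the buggy behavior.
    if error_behaviors & patch_inv:
        return "overfitting: maintains error behavior"
    # Fallback: invariants were inconclusive, so defer to the syntactic model.
    if syntactic_score() >= threshold:
        return "overfitting: flagged by syntactic model"
    return "not flagged"


# Hypothetical invariants for a null-dereference bug.
buggy = {"x >= 0", "result == null"}
fixed = {"x >= 0", "result != null"}
print(assess_patch(buggy, fixed, {"x >= 0", "result == null"}, lambda: 0.0))
```

Passing the model as a callable keeps the fallback lazy: it is only evaluated when neither invariant rule fires, which mirrors the order of checks in the abstract.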

SIRL: Similarity-based Implicit Representation Learning

Andreea Bobu, Yi Liu, Rohin Shah, Daniel S. Brown, Anca D. Dragan

Category: Robotics | Artificial Intelligence | Machine Learning

2023-01-02

When robots learn reward functions using high-capacity models that take raw state directly as input, they need to learn both a representation for what matters in the task -- the task "features" -- and how to combine these features into a single objective. If they try to do both at once from input designed to teach the full reward function, it is easy to end up with a representation that contains spurious correlations in the data, which fails to generalize to new settings. Instead, our ultimate goal is to enable robots to identify and isolate the causal features that people actually care about and use when they represent states and behavior. Our idea is that we can tune into this representation by asking users what behaviors they consider similar: behaviors will be similar if the features that matter are similar, even if low-level behavior is different; conversely, behaviors will be different if even one of the features that matter differs. This, in turn, is what enables the robot to disambiguate between what needs to go into the representation and what is spurious, as well as which aspects of behavior can be compressed together and which cannot. The notion of learning representations based on similarity has a nice parallel in contrastive learning, a self-supervised representation learning technique that maps visually similar data points to similar embeddings, where similarity is defined by a designer through data augmentation heuristics. By contrast, in order to learn the representations that people use, so we can learn their preferences and objectives, we use their definition of similarity. In simulation as well as in a user study, we show that learning through such similarity queries leads to representations that, while far from perfect, are indeed more generalizable than self-supervised and task-input alternatives.
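One common way to turn "these two behaviors are similar, this third one is not" answers into a training signal is a triplet margin loss over embeddings. The sketch below assumes that formulation for illustration only; the abstract does not state the exact loss the authors use, and the embeddings here are hypothetical hand-written vectors rather than outputs of a learned encoder.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, similar, dissimilar, margin=1.0):
    """Zero once `similar` is closer to `anchor` than `dissimilar` by at least `margin`."""
    return max(0.0, euclidean(anchor, similar) - euclidean(anchor, dissimilar) + margin)


# Hypothetical embeddings of three robot trajectories; a user judged the first
# two to be similar and the third to be different.
anchor = [0.0, 0.0]
judged_similar = [0.0, 1.0]
judged_different = [3.0, 4.0]
print(triplet_loss(anchor, judged_similar, judged_different))
```

Minimizing this loss over many user-labeled triplets pulls together behaviors that people consider similar, which is the role the similarity queries play in the abstract.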

G-CEALS: Gaussian Cluster Embedding in Autoencoder Latent Space for Tabular Data Representation

Manar D. Samad, Sakib Abrar

Category: Machine Learning | Artificial Intelligence

2023-01-02

The latent space of autoencoders has been improved for clustering image data by jointly learning a t-distributed embedding with a clustering algorithm inspired by the neighborhood embedding concept proposed for data visualization. However, multivariate tabular data pose different challenges in representation learning than image data, where traditional machine learning is often superior to deep tabular data learning. In this paper, we address the challenges of learning tabular data in contrast to image data and present a novel Gaussian Cluster Embedding in Autoencoder Latent Space (G-CEALS) algorithm by replacing t-distributions with multivariate Gaussian clusters. Unlike current methods, the proposed approach independently defines the Gaussian embedding and the target cluster distribution to accommodate any clustering algorithm in representation learning. A trained G-CEALS model extracts a quality embedding for unseen test data. Based on the embedding clustering accuracy, the average rank of the proposed G-CEALS method is 1.4 (0.7), which is superior to all eight baseline clustering and cluster embedding methods on seven tabular data sets. This paper shows one of the first algorithms to jointly learn embedding and clustering to improve multivariate tabular data representation in downstream clustering.

Skew Class-balanced Re-weighting for Unbiased Scene Graph Generation

Haeyong Kang, Chang D. Yoo

Category: Machine Learning

2023-01-01

An unbiased scene graph generation (SGG) algorithm, referred to as Skew Class-balanced Re-weighting (SCR), is proposed to address the biased predicate predictions caused by the long-tailed distribution. Prior works focus mainly on alleviating the deteriorating performance of minority predicate predictions, at the cost of drastically dropping recall scores, i.e., losing majority predicate performance. The trade-off between majority and minority predicate performance on the limited SGG datasets has not yet been correctly analyzed. In this paper, to alleviate the issue, the Skew Class-balanced Re-weighting (SCR) loss function is considered for unbiased SGG models. Leveraging the skewness of biased predicate predictions, SCR estimates the target predicate weight coefficient and then re-weights more toward the biased predicates for a better trade-off between the majority predicates and the minority ones. Extensive experiments conducted on the standard Visual Genome dataset and Open Image V4 & V6 show the performance and generality of SCR with traditional SGG models.
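The abstract does not give the SCR weight formula, so the sketch below does not implement SCR. It shows a generic class-balanced re-weighting instead, using the well-known "effective number of samples" weights, to illustrate how a long-tailed predicate distribution translates into per-class loss weights. The predicate names and counts are hypothetical.

```python
def class_balanced_weights(counts, beta=0.999):
    """Per-class loss weights from the effective number of samples.

    weight_c is proportional to (1 - beta) / (1 - beta ** n_c), where n_c >= 1
    is the number of training samples of class c. Weights are rescaled so that
    they sum to the number of classes.
    """
    raw = {c: (1.0 - beta) / (1.0 - beta ** n) for c, n in counts.items()}
    scale = len(raw) / sum(raw.values())
    return {c: w * scale for c, w in raw.items()}


# Hypothetical predicate frequencies: one head class and two tail classes.
predicate_counts = {"on": 5000, "riding": 50, "parked on": 20}
for predicate, weight in class_balanced_weights(predicate_counts).items():
    print(predicate, round(weight, 3))
```

With weights like these, rare predicates contribute more to the loss per sample. The trade-off the abstract highlights appears when that push toward minority classes costs recall on the majority ones, which is what SCR is described as balancing.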

Lightmorphic Signatures Analysis Toolkit

D. Damian

Category: Machine Learning

2022-12-31

In this paper we discuss the theory used in the design of an open source lightmorphic signatures analysis toolkit (LSAT). In addition to providing a core functionality, the software package enables specific optimizations with its modular and customizable design. To promote its usage and inspire future contributions, LSAT is publicly available. By using a self-supervised neural network and augmented machine learning algorithms, LSAT provides an easy-to-use interface with ample documentation. The experiments demonstrate that LSAT improves the otherwise tedious and error-prone tasks of translating lightmorphic associated data into usable spectrograms, enhanced with parameter tuning and performance analysis. With the provided mathematical functions, LSAT validates the nonlinearity encountered in the data conversion process while ensuring suitability of the forecasting algorithms.

Detecting Change Intervals with Isolation Distributional Kernel

Yang Cao, Ye Zhu, Kai Ming Ting, Flora D. Salim, Hong Xian Li, Gang Li

Category: Machine Learning

2022-12-30

Detecting abrupt changes in data distribution is one of the most significant tasks in streaming data analysis. Although many unsupervised Change-Point Detection (CPD) methods have been proposed recently to identify those changes, they still suffer from missing subtle changes, poor scalability, and/or sensitivity to noise points. To meet these challenges, we are the first to generalise the CPD problem as a special case of the Change-Interval Detection (CID) problem. We then propose a CID method, named iCID, based on a recent Isolation Distributional Kernel (IDK). iCID identifies a change interval if there is a high dissimilarity score between two non-homogeneous, temporally adjacent intervals. The data-dependent property and finite feature map of IDK enable iCID to efficiently identify various types of change points in data streams while tolerating noise points. Moreover, the proposed online and offline versions of iCID have the ability to optimise key parameter settings. The effectiveness and efficiency of iCID have been systematically verified on both synthetic and real-world datasets.
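The idea of scoring adjacent intervals by distributional dissimilarity can be sketched as follows. This is not iCID: it swaps in a plain Gaussian kernel mean embedding in place of the Isolation Distributional Kernel, uses fixed non-overlapping windows, and works on a one-dimensional stream. All names and parameter values are illustrative assumptions.

```python
import math


def mean_kernel(a, b, sigma=1.0):
    """Average Gaussian kernel value over all pairs (x in a, y in b)."""
    total = sum(math.exp(-((x - y) ** 2) / (2 * sigma ** 2)) for x in a for y in b)
    return total / (len(a) * len(b))


def dissimilarity(a, b, sigma=1.0):
    """1 minus the normalized kernel similarity between two intervals."""
    norm = math.sqrt(mean_kernel(a, a, sigma) * mean_kernel(b, b, sigma))
    return 1.0 - mean_kernel(a, b, sigma) / norm


def interval_scores(stream, window):
    """Dissimilarity score for each pair of adjacent, non-overlapping windows."""
    intervals = [stream[i:i + window] for i in range(0, len(stream) - window + 1, window)]
    return [dissimilarity(intervals[k], intervals[k + 1]) for k in range(len(intervals) - 1)]


# Synthetic stream whose level jumps from about 0 to about 5 halfway through.
stream = [0.1 * (i % 3) for i in range(20)] + [5.0 + 0.1 * (i % 3) for i in range(20)]
scores = interval_scores(stream, window=5)
print(scores.index(max(scores)))  # index of the window pair that straddles the change
```

A pair of adjacent windows drawn from the same distribution scores near 0, while a pair on either side of the jump scores near 1, so thresholding the scores flags candidate change intervals.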

Current State of Community-Driven Radiological AI Deployment in Medical Imaging

Vikash Gupta, Barbaros Selnur Erdal, Carolina Ramirez, Ralf Floca, Laurence Jackson, Brad Genereaux, Sidney Bryson, Christopher P Bridge, Jens Kleesiek, Felix Nensa

Category: Artificial Intelligence

2022-12-29

Artificial Intelligence (AI) has become a commonplace way to solve routine everyday tasks. Because of the exponential growth in medical imaging data volume and complexity, the workload on radiologists is steadily increasing. We project that the gap between the number of imaging exams and the number of expert radiologist readers required to cover this increase will continue to expand, consequently introducing a demand for AI-based tools that improve the efficiency with which radiologists can comfortably interpret these exams. AI has been shown to improve efficiency in medical-image generation, processing, and interpretation, and a variety of such AI models have been developed across research labs worldwide. However, very few of these, if any, find their way into routine clinical use, a discrepancy that reflects the divide between AI research and successful AI translation. To address the barrier to clinical deployment, we have formed the MONAI Consortium, an open-source community which is building standards for AI deployment in healthcare institutions and developing tools and infrastructure to facilitate their implementation. This report represents several years of weekly discussions and hands-on problem-solving experience by groups of industry experts and clinicians in the MONAI Consortium. We identify barriers between AI-model development in research labs and subsequent clinical deployment and propose solutions. Our report provides guidance on processes which take an imaging AI model from development to clinical implementation in a healthcare institution. We discuss various AI integration points in a clinical radiology workflow. We also present a taxonomy of radiology AI use cases. Through this report, we intend to educate the stakeholders in healthcare and AI (AI researchers, radiologists, imaging informaticists, and regulators) about cross-disciplinary challenges and possible solutions.
